THE CYCLE CONVERGENCE OF RESTARTED GMRES FOR 
NORMAL MATRICES IS SUBLINEAR 
Abstract. We prove that the cycle-convergence of the restarted GMRES applied to a system
of linear equations with a normal coefficient matrix is sublinear.
1. Introduction. The generalized minimal residual method (GMRES) was originally
introduced by Saad and Schultz [12] in 1986, and has become a popular method
for solving non-Hermitian systems of linear equations

(1.1) $Ax = b, \quad A \in \mathbb{C}^{n \times n}, \quad b \in \mathbb{C}^{n}.$

GMRES is classified as a Krylov subspace (projection) iterative method. At every
new iteration $i$, GMRES constructs an approximation $x_i$ to the exact solution of (1.1),
such that the 2-norm of the corresponding residual vector $r_i = b - Ax_i$ is minimized
over the affine space $r_0 + A\mathcal{K}_i(A, r_0)$, i.e.

(1.2) $\|r_i\| = \min_{u \in \mathcal{K}_i(A, r_0)} \|r_0 - Au\|,$

where $\mathcal{K}_i(A, r_0)$ is the $i$-dimensional Krylov subspace

$\mathcal{K}_i(A, r_0) = \operatorname{span}\{r_0, Ar_0, \ldots, A^{i-1}r_0\}$

induced by the matrix $A$ and the initial residual vector $r_0 = b - Ax_0$, with $x_0$ being
an initial approximate solution of (1.1).



As usual in a linear setting, a notion of minimality corresponds to some orthogonality
condition. In our case, the minimization (1.2) is equivalent to forcing the new
residual vector $r_i$ to be orthogonal to the subspace $A\mathcal{K}_i(A, r_0)$ (also known as the
Krylov residual subspace). In practice, for a large problem size, the latter orthogonality
condition results in a costly procedure of orthogonalization against the expanding
Krylov residual subspace. Orthogonalization together with the storage requirement makes
the computational and storage costs of GMRES prohibitive for practical applications. A
straightforward treatment for this complication is the so-called restarted GMRES [12].

The restarted GMRES, or GMRES(m), is based on restarting GMRES after every
$m$ iterations. At each restart, we use the latest approximate solution as the initial
approximation for the next GMRES run. Within this framework a single run of $m$
GMRES iterations is called a GMRES(m) cycle, and $m$ is called the restart parameter.
Consequently, restarted GMRES can be regarded as a sequence of GMRES(m) cycles.
When convergence happens without any restart occurring, the algorithm is known
as full GMRES.
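
The restart loop described above can be sketched in a few lines of Python. This is a minimal dense illustration under stated assumptions, not the Arnoldi-based implementation normally used in practice: the function names `gmres_cycle` and `restarted_gmres`, the test matrix, and the use of NumPy's QR and least-squares routines are all choices made for this example only.

```python
import numpy as np

def gmres_cycle(A, b, x0, m):
    """One GMRES(m) cycle from x0: minimize ||b - A x|| over x0 + K_m(A, r0)."""
    r0 = b - A @ x0
    K = np.empty((len(b), m), dtype=complex)
    K[:, 0] = r0
    for j in range(1, m):
        K[:, j] = A @ K[:, j - 1]            # columns r0, A r0, ..., A^(m-1) r0
    Q, _ = np.linalg.qr(K)                   # orthonormal basis of the Krylov subspace
    y, *_ = np.linalg.lstsq(A @ Q, r0, rcond=None)
    return x0 + Q @ y

def restarted_gmres(A, b, m, cycles):
    """Run a fixed number of GMRES(m) cycles; return x and the residual norms."""
    x = np.zeros(len(b), dtype=complex)
    norms = [np.linalg.norm(b - A @ x)]
    for _ in range(cycles):
        x = gmres_cycle(A, b, x, m)          # latest iterate seeds the next cycle
        norms.append(np.linalg.norm(b - A @ x))
    return x, norms

rng = np.random.default_rng(0)
n, m = 30, 4
A = rng.standard_normal((n, n)) + 5.0 * np.eye(n)   # hypothetical test matrix
b = rng.standard_normal(n)
x, norms = restarted_gmres(A, b, m, cycles=10)
print(norms)
```

Because the zero correction is always admissible in each cycle, the printed residual norms should not increase from one cycle to the next, up to rounding.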

Dealing with the restarted GMRES, our interest will shift towards the residual
vectors $r_k$ at the end of every $k$-th GMRES(m) cycle (as opposed to the residual
vectors (1.2) at each iteration of the original algorithm).

Definition 1 (cycle-convergence). We define the cycle-convergence of restarted
GMRES(m) as the convergence of the norms of the residual vectors $\|r_k\|$ at the end of
every $k$-th GMRES(m) cycle.

We note that each $r_k$ satisfies the local minimality condition

(1.3) $\|r_k\| = \min_{u \in \mathcal{K}_m(A, r_{k-1})} \|r_{k-1} - Au\|,$

where $\mathcal{K}_m(A, r_{k-1})$ is the $m$-dimensional Krylov subspace produced at the $k$-th GMRES(m)
cycle,

(1.4) $\mathcal{K}_m(A, r_{k-1}) = \operatorname{span}\{r_{k-1}, Ar_{k-1}, \ldots, A^{m-1}r_{k-1}\}.$



The price paid for the reduction of the computational work, as follows from (1.3)
and (1.4), is the loss of global optimality (1.2). Although (1.3) implies that the norms
of the residual vectors $r_k$ are monotonically nonincreasing, GMRES(m) can stagnate [HJ IT7].
This is in contrast with full GMRES, which is guaranteed to converge to the exact
solution of (1.1) in $n$ steps (assuming exact arithmetic). However, a proper choice of a
preconditioner and/or a restart parameter, e.g. [SJ [51 [H], can significantly accelerate
the convergence of GMRES(m), thus making the method practically attractive.

While much effort has been devoted to characterizing the convergence
of full GMRES, e.g. HI 19 Ell HI US], our understanding of the behavior of
GMRES(m) is far from complete, leaving us with more questions than answers, e.g. [5].
In this manuscript, we prove that the cycle-convergence of restarted GMRES for
normal matrices is sublinear. This statement means that the reduction in the norm
of the residual vector at the current GMRES(m) cycle cannot be better than the
reduction at the previous cycle.

The current manuscript was inspired by ideas introduced in the technical report
[IB] by I. Zavorin. In that work the author shows that, at every step of GMRES,
a diagonalizable matrix $A$ and its Hermitian transpose $A^H$ yield the same worst-case
behavior, and derives a necessary condition (the so-called cross-equality) for the
worst-case right-hand side vector. We inherit the mathematical tools for our analysis
from [IB], as well as [10, 17], and give their brief description, slightly adapted to the
case of the restarted GMRES and a normal matrix $A$, in Section 2. The main result
on the sublinear cycle-convergence is proved in Section 3. In Section 4, the behavior
of GMRES(m) in the nonnormal case is discussed.

2. Krylov matrix, its pseudoinverse and spectral factorization. Throughout
the manuscript we will assume (unless otherwise explicitly stated) $A$ to be nonsingular
and normal, i.e. $A$ allows the decomposition

(2.1) $A = V \Lambda V^H,$

where $\Lambda \in \mathbb{C}^{n \times n}$ is a diagonal matrix with the diagonal elements being the nonzero
eigenvalues of $A$, and $V \in \mathbb{C}^{n \times n}$ is a unitary matrix of the corresponding eigenvectors.

Let us denote the $k$-th cycle of GMRES(m) applied to the system (1.1) with the
initial residual vector $r_{k-1}$ as GMRES($A$, $m$, $r_{k-1}$), $1 \le m \le n - 1$. We assume that
the residual vector $r_k$, produced at the end of GMRES($A$, $m$, $r_{k-1}$), is nonzero.

According to (1.3), a run of GMRES($A$, $m$, $r_{k-1}$) entails the Krylov subspace
$\mathcal{K}_m(A, r_{k-1})$. For each $\mathcal{K}_m(A, r_{k-1})$ we define a matrix $K(A, r_{k-1}) \in \mathbb{C}^{n \times (m+1)}$
such that

(2.2) $K(A, r_{k-1}) = [\, r_{k-1} \;\; Ar_{k-1} \;\; \ldots \;\; A^m r_{k-1} \,], \quad k = 1, 2, \ldots, q,$






where q is the total number of GMRES(m) cycles. 

The matrix (2.2) is called the Krylov matrix. We will say that $K(A, r_{k-1})$ corresponds
to the cycle GMRES($A$, $m$, $r_{k-1}$). Note that the columns of $K(A, r_{k-1})$ span
the next, $(m+1)$-dimensional, Krylov subspace $\mathcal{K}_{m+1}(A, r_{k-1})$. By the assumption
that $r_k \neq 0$,

$\operatorname{rank}(K(A, r_{k-1})) = m + 1.$

This latter equality allows us to introduce the Moore-Penrose pseudoinverse of
the matrix $K(A, r_{k-1})$,

$K^{\dagger}(A, r_{k-1}) = \left(K^H(A, r_{k-1}) K(A, r_{k-1})\right)^{-1} K^H(A, r_{k-1}) \in \mathbb{C}^{(m+1) \times n},$

which is well-defined and unique. The following lemma shows that the first column
of $\left(K^{\dagger}(A, r_{k-1})\right)^H$ is the next residual vector $r_k$ up to a scaling factor.

Lemma 2. Let $A \in \mathbb{C}^{n \times n}$ (not necessarily normal) and let
$K(A, r_{k-1}) \in \mathbb{C}^{n \times (m+1)}$ be the full rank Krylov matrix corresponding to the cycle
GMRES($A$, $m$, $r_{k-1}$) for any $k = 1, 2, \ldots, q$. Then

(2.3) $\left(K^{\dagger}(A, r_{k-1})\right)^H e_1 = \frac{1}{\|r_k\|^2}\, r_k,$

where $e_1 = [1 \; 0 \; \ldots \; 0]^T \in \mathbb{R}^{m+1}$.

Proof. See Ipsen pH Theorem 2.1], as well as [2, 13]. □
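
The identity (2.3) is easy to check numerically. The sketch below is only an illustration under assumptions of our own (a random complex test matrix, NumPy's `pinv` and `lstsq`, and the variable names `r_prev`, `r_next`); it builds the Krylov matrix (2.2), computes the next residual by a direct least-squares projection, and compares both sides of (2.3).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
# Arbitrary (not necessarily normal) test matrix, scaled to keep its powers moderate.
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
r_prev = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Krylov matrix (2.2): columns r_{k-1}, A r_{k-1}, ..., A^m r_{k-1}.
cols = [r_prev]
for _ in range(m):
    cols.append(A @ cols[-1])
K = np.column_stack(cols)

# GMRES(m) residual r_k: remove from r_{k-1} its projection onto
# span{A r_{k-1}, ..., A^m r_{k-1}}, i.e. onto the last m columns of K.
AK = K[:, 1:]
y, *_ = np.linalg.lstsq(AK, r_prev, rcond=None)
r_next = r_prev - AK @ y

lhs = np.linalg.pinv(K).conj().T[:, 0]        # first column of (K^dagger)^H
rhs = r_next / np.linalg.norm(r_next) ** 2    # r_k / ||r_k||^2
print(np.linalg.norm(lhs - rhs))
```

The printed difference should be at the level of rounding error whenever the Krylov matrix has full column rank.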
Another important idea, mentioned in [10] and used extensively in [16, 17], provides
the so-called spectral factorization of the Krylov matrix $K(A, r_{k-1})$ into three
components, each one encapsulating separately the information on the eigenvalues of $A$,
its eigenvectors and the previous residual vector $r_{k-1}$.

Lemma 3. Let $A \in \mathbb{C}^{n \times n}$ satisfy (2.1). Then the Krylov matrix $K(A, r_{k-1})$,
for any $k = 1, 2, \ldots, q$, can be factorized as

(2.4) $K(A, r_{k-1}) = V D_{k-1} Z,$

where $d_{k-1} = V^H r_{k-1} \in \mathbb{C}^n$, $D_{k-1} = \operatorname{diag}(d_{k-1}) \in \mathbb{C}^{n \times n}$ and $Z \in \mathbb{C}^{n \times (m+1)}$ is the
Vandermonde matrix computed from the eigenvalues of $A$,

(2.5) $Z = [\, e \;\; \Lambda e \;\; \ldots \;\; \Lambda^m e \,], \quad e = [1 \; 1 \; \ldots \; 1]^T \in \mathbb{R}^n.$

Proof. Starting from (2.1) and the definition of the Krylov matrix (2.2),

$K(A, r_{k-1}) = [\, r_{k-1} \;\; Ar_{k-1} \;\; \ldots \;\; A^m r_{k-1} \,]$
$\quad = [\, VV^H r_{k-1} \;\; V\Lambda V^H r_{k-1} \;\; \ldots \;\; V\Lambda^m V^H r_{k-1} \,]$
$\quad = V [\, d_{k-1} \;\; \Lambda d_{k-1} \;\; \ldots \;\; \Lambda^m d_{k-1} \,]$
$\quad = V [\, D_{k-1} e \;\; \Lambda D_{k-1} e \;\; \ldots \;\; \Lambda^m D_{k-1} e \,]$
$\quad = V D_{k-1} [\, e \;\; \Lambda e \;\; \ldots \;\; \Lambda^m e \,] = V D_{k-1} Z.$

□

It is clear that the statement of Lemma 3 can be easily generalized to the case of
a diagonalizable (nonnormal) matrix $A$, provided that we define $d_{k-1} = V^{-1} r_{k-1}$ in
the lemma.
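
As a quick sanity check of the factorization (2.4), the following sketch builds a random normal matrix from a unitary $V$ and a diagonal $\Lambda$ and compares the Krylov matrix with the product $V D Z$. The construction of the test matrix and the use of `np.vander` for $Z$ are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 3

# Random unitary V (QR of a complex Gaussian matrix) and complex eigenvalues.
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = V @ np.diag(lam) @ V.conj().T            # normal by construction, see (2.1)
r = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Krylov matrix (2.2) with m + 1 columns.
K = np.column_stack([np.linalg.matrix_power(A, j) @ r for j in range(m + 1)])

d = V.conj().T @ r                           # d = V^H r
D = np.diag(d)
Z = np.vander(lam, m + 1, increasing=True)   # columns e, Lambda e, ..., Lambda^m e

print(np.linalg.norm(K - V @ D @ Z))
```

The three factors isolate the eigenvectors ($V$), the residual in eigenvector coordinates ($D$), and the eigenvalues ($Z$), which is what the proof of the main theorem exploits.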
3. The sublinear cycle-convergence of GMRES(m). Along with (1.1) let
us consider the system

(3.1) $A^H x = b$

with the matrix $A$ replaced by its Hermitian transpose. Clearly, according to (2.1),

(3.2) $A^H = V \bar{\Lambda} V^H.$



It turns out that $m$ steps of GMRES applied to the systems (1.1) and (3.1)
produce residual vectors of equal norms, provided that the initial residual vector
is the same for both GMRES runs. This observation is crucial in concluding the
sublinear cycle-convergence of GMRES(m) and is formalized in the following lemma.

Lemma 4. Let $r_m$ and $\hat{r}_m$ be the nonzero residual vectors obtained by applying
$m$ steps of GMRES to the systems (1.1) and (3.1) respectively, $1 \le m \le n - 1$. Then

$\|r_m\| = \|\hat{r}_m\|,$

provided that the initial approximate solutions of (1.1) and (3.1) induce the same
initial residual vector $r_0$.

Proof. Consider a polynomial $p(z) \in \mathcal{P}_m$, where $\mathcal{P}_m$ is the set of all polynomials
of degree at most $m$ defined on the complex plane, such that $p(0) = 1$. Let $r_0$ be a
nonzero initial residual vector for the systems (1.1) and (3.1) simultaneously. Since
the matrix $A$ is normal, so is $p(A)$, thus $p(A)$ commutes with its Hermitian transpose
$p^H(A)$. We have

$\|p(A) r_0\|^2 = (p(A) r_0, p(A) r_0) = (r_0, p^H(A) p(A) r_0)$
$\quad = (r_0, p(A) p^H(A) r_0) = (p^H(A) r_0, p^H(A) r_0)$
$\quad = ((V p(\Lambda) V^H)^H r_0, (V p(\Lambda) V^H)^H r_0) = (V \bar{p}(\bar{\Lambda}) V^H r_0, V \bar{p}(\bar{\Lambda}) V^H r_0)$
$\quad = (\bar{p}(V \bar{\Lambda} V^H) r_0, \bar{p}(V \bar{\Lambda} V^H) r_0) = \|\bar{p}(V \bar{\Lambda} V^H) r_0\|^2,$

where $\bar{p}(z) \in \mathcal{P}_m$ is the polynomial obtained from $p(z)$ by conjugating its coefficients.
By (3.2) we conclude that

$\|p(A) r_0\| = \|\bar{p}(A^H) r_0\|.$

Since the last equality holds for any $p(z) \in \mathcal{P}_m$, it will also hold for the (GMRES)
polynomial $p_m(z)$, which minimizes $\|p(A) r_0\|$ over $\mathcal{P}_m$. This polynomial exists and is
unique [9, Theorem 2]. Moreover, since the conjugation of coefficients maps $\mathcal{P}_m$ onto
itself, $\bar{p}_m(z)$ minimizes $\|p(A^H) r_0\|$ over $\mathcal{P}_m$. Thus,

$\|r_m\| = \min_{p \in \mathcal{P}_m} \|p(A) r_0\| = \|p_m(A) r_0\| = \|\bar{p}_m(A^H) r_0\|$
$\quad = \min_{p \in \mathcal{P}_m} \|p(A^H) r_0\| = \|\hat{r}_m\|,$

which proves the lemma. Moreover, we note that the two GMRES polynomials constructed
after $m$ steps of GMRES applied to (1.1) and (3.1) with the same initial
residual vector are the same up to the complex conjugation of coefficients. □

In the framework of the restarted GMRES, Lemma 4 suggests that the cycles
GMRES($A$, $m$, $r_{k-1}$) and GMRES($A^H$, $m$, $r_{k-1}$) result in the residual vectors $r_k$ and
$\hat{r}_k$ of the same norm.
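
Lemma 4 can be illustrated numerically with a small experiment. The sketch below is an assumption-laden illustration (dense least-squares GMRES, a randomly generated normal matrix, and the helper name `gmres_residual` are ours); it applies $m$ GMRES steps to $A$ and to $A^H$ from the same initial residual and prints the two residual norms.

```python
import numpy as np

def gmres_residual(A, r0, m):
    """Residual after m GMRES steps: r0 minus its projection onto A K_m(A, r0)."""
    K = np.column_stack([np.linalg.matrix_power(A, j) @ r0 for j in range(m)])
    Q, _ = np.linalg.qr(K)                       # orthonormal basis of K_m(A, r0)
    y, *_ = np.linalg.lstsq(A @ Q, r0, rcond=None)
    return r0 - A @ (Q @ y)

rng = np.random.default_rng(3)
n, m = 12, 4
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = V @ np.diag(lam) @ V.conj().T                # normal by construction
r0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)

r_m = gmres_residual(A, r0, m)                   # m steps on A x = b
r_hat_m = gmres_residual(A.conj().T, r0, m)      # m steps on A^H x = b
print(np.linalg.norm(r_m), np.linalg.norm(r_hat_m))
```

The two vectors generally differ, but their norms should agree to rounding error; replacing `A` by a nonnormal matrix typically breaks the agreement.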






We are now ready to state the main theorem.

Theorem 5 (The sublinear cycle-convergence of GMRES(m)). Let $r_k$ be a
sequence of nonzero residual vectors produced by GMRES(m) applied to the system
(1.1) with a nonsingular normal matrix $A \in \mathbb{C}^{n \times n}$, $1 \le m \le n - 1$. Then

(3.3) $\frac{\|r_k\|}{\|r_{k-1}\|} \le \frac{\|r_{k+1}\|}{\|r_k\|}, \quad k = 1, \ldots, q - 1,$

where $q$ is the total number of GMRES(m) cycles.

Proof. Left multiplication of both parts of (2.3) by $K^H(A, r_{k-1})$ leads to

$e_1 = \frac{1}{\|r_k\|^2} K^H(A, r_{k-1})\, r_k.$

By (2.4) in Lemma 3 we factorize the Krylov matrix $K(A, r_{k-1})$ in the equality
above:

$e_1 = \frac{1}{\|r_k\|^2} (V D_{k-1} Z)^H r_k = \frac{1}{\|r_k\|^2} Z^H \bar{D}_{k-1} V^H r_k = \frac{1}{\|r_k\|^2} Z^H \bar{D}_{k-1} d_k.$

Applying complex conjugation to this equality (and observing that $e_1$ is real), we get

$e_1 = \frac{1}{\|r_k\|^2} Z^T D_{k-1} \bar{d}_k.$

According to the definition of $D_{k-1}$ in Lemma 3, $D_{k-1} \bar{d}_k = \bar{D}_k d_{k-1}$, thus

$e_1 = \frac{1}{\|r_k\|^2} Z^T \bar{D}_k d_{k-1} = \frac{1}{\|r_k\|^2} \left(Z^T \bar{D}_k V^H\right) r_{k-1}.$

From (2.4) and (3.2) we notice that

$Z^T \bar{D}_k V^H = (V D_k \bar{Z})^H = K^H(A^H, r_k),$

which leads to the following equality

(3.4) $e_1 = \frac{1}{\|r_k\|^2} K^H(A^H, r_k)\, r_{k-1}.$

Considering the residual vector $r_{k-1}$ as a solution of the underdetermined system
(3.4), we can represent the latter as

(3.5) $r_{k-1} = \|r_k\|^2 \left(K^H(A^H, r_k)\right)^{\dagger} e_1 + w_k,$

where $w_k \in \operatorname{null}\left(K^H(A^H, r_k)\right)$. Moreover, since

$w_k \perp \left(K^H(A^H, r_k)\right)^{\dagger} e_1,$

by the Pythagorean theorem we obtain

$\|r_{k-1}\|^2 = \|r_k\|^4 \left\|\left(K^H(A^H, r_k)\right)^{\dagger} e_1\right\|^2 + \|w_k\|^2,$




Fig. 1. Cycle-convergence of GMRES(5) applied to a 100-by-100 normal matrix.



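now since $\left(K^H(A^H, r_k)\right)^{\dagger} = \left(K^{\dagger}(A^H, r_k)\right)^H$, we get

$\|r_{k-1}\|^2 = \|r_k\|^4 \left\|\left(K^{\dagger}(A^H, r_k)\right)^H e_1\right\|^2 + \|w_k\|^2,$

and then by (2.3)

$\|r_{k-1}\|^2 = \frac{\|r_k\|^4}{\|\hat{r}_{k+1}\|^2} + \|w_k\|^2 \ge \frac{\|r_k\|^4}{\|\hat{r}_{k+1}\|^2},$

where $\hat{r}_{k+1}$ is the residual vector at the end of the cycle GMRES($A^H$, $m$, $r_k$). Finally

$\frac{\|r_k\|^2}{\|r_{k-1}\|^2} \le \frac{\|r_k\|^2 \|\hat{r}_{k+1}\|^2}{\|r_k\|^4} = \frac{\|\hat{r}_{k+1}\|^2}{\|r_k\|^2},$

so that

(3.6) $\frac{\|r_k\|}{\|r_{k-1}\|} \le \frac{\|\hat{r}_{k+1}\|}{\|r_k\|}.$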

By Lemma 4 the norm of the residual vector $\hat{r}_{k+1}$ at the end of the cycle
GMRES($A^H$, $m$, $r_k$) is equal to the norm of the residual vector $r_{k+1}$ at the end
of the cycle GMRES($A$, $m$, $r_k$), which completes the proof of the theorem. □

Geometrically the theorem suggests that any residual curve of a restarted GMRES,
applied to a system with a nonsingular normal matrix, is nonincreasing and
concave up (Figure 1).
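
The inequality (3.3) can be observed in a small numerical experiment. The sketch below is only illustrative: the dense least-squares cycle `gmres_cycle`, the choice of eigenvalues scattered around the origin (made here so that convergence is slow and the residuals stay well above rounding level), and the problem sizes are assumptions of this example.

```python
import numpy as np

def gmres_cycle(A, r, m):
    """Residual at the end of one GMRES(m) cycle started from the residual r."""
    K = np.column_stack([np.linalg.matrix_power(A, j) @ r for j in range(m)])
    Q, _ = np.linalg.qr(K)                       # orthonormal basis of K_m(A, r)
    y, *_ = np.linalg.lstsq(A @ Q, r, rcond=None)
    return r - A @ (Q @ y)

rng = np.random.default_rng(4)
n, m, q = 40, 3, 8
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = (0.5 + rng.random(n)) * np.exp(2j * np.pi * rng.random(n))
A = V @ np.diag(lam) @ V.conj().T                # nonsingular normal matrix

r = [rng.standard_normal(n) + 1j * rng.standard_normal(n)]
for _ in range(q):
    r.append(gmres_cycle(A, r[-1], m))

norms = [np.linalg.norm(v) for v in r]
ratios = [norms[k] / norms[k - 1] for k in range(1, q + 1)]   # ||r_k|| / ||r_{k-1}||
print(ratios)
```

According to Theorem 5, the printed reduction ratios should form a nondecreasing sequence bounded by one, which is the "concave up" shape of the residual curve on a logarithmic scale.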

From the proof of Theorem 5 it is clear that, for a fixed $k$, equality in
(3.3) holds if and only if the vector $w_k$ in (3.5), which lies in the null space of the
corresponding matrix $K^H(A^H, r_k)$, is zero. In particular, when the restart parameter is
chosen to be one less than the problem size, i.e. $m = n - 1$, the matrix $K^H(A^H, r_k)$ in (3.4)
becomes an $n$-by-$n$ nonsingular matrix, hence with a trivial null space, and thus the
inequality (3.3) is indeed an equality when $m = n - 1$.



It turns out that the cycle-convergence of GMRES(n - 1), applied to the system
(1.1) with a nonsingular normal matrix $A$, can be completely determined by the norms
of the two initial residual vectors $r_0$ and $r_1$.

Corollary 6 (The cycle-convergence of GMRES(n - 1)). Let $\|r_0\|$ and $\|r_1\|$ be given.
Then, under the assumptions of Theorem 5, the norms of the residual vectors $r_k$ at the
end of each GMRES(n - 1) cycle obey the following formula

(3.7) $\|r_{k+1}\| = \|r_1\| \left(\frac{\|r_1\|}{\|r_0\|}\right)^{k}, \quad k = 1, \ldots, q - 1.$

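Proof. The representation (3.5) of the residual vector $r_{k-1}$, for $m = n - 1$, turns
into

(3.8) $r_{k-1} = \|r_k\|^2 \left(K^H(A^H, r_k)\right)^{-1} e_1,$

implying, by the proof of Theorem 5, that the equality in (3.3) holds at each
GMRES(n - 1) cycle. Thus,

$\|r_{k+1}\| = \|r_k\| \frac{\|r_k\|}{\|r_{k-1}\|}, \quad k = 1, \ldots, q - 1.$

We show (3.7) by induction on $k$. Using the formula above, it is easy to verify
(3.7) for $\|r_2\|$ and $\|r_3\|$ ($k = 1, 2$). Let us assume that for some $k$, $3 \le k \le q - 1$, $\|r_{k-1}\|$
and $\|r_k\|$ can also be computed by (3.7). Then

$\|r_{k+1}\| = \|r_k\| \frac{\|r_k\|}{\|r_{k-1}\|}
  = \|r_1\| \left(\frac{\|r_1\|}{\|r_0\|}\right)^{k-1}
    \frac{\|r_1\| \left(\frac{\|r_1\|}{\|r_0\|}\right)^{k-1}}{\|r_1\| \left(\frac{\|r_1\|}{\|r_0\|}\right)^{k-2}}
  = \|r_1\| \left(\frac{\|r_1\|}{\|r_0\|}\right)^{k-1} \frac{\|r_1\|}{\|r_0\|}
  = \|r_1\| \left(\frac{\|r_1\|}{\|r_0\|}\right)^{k}.$

Thus, (3.7) holds for all $k = 1, \ldots, q - 1$.

□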
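
Formula (3.7) says that GMRES(n - 1) on a normal matrix reduces the residual norm by the same factor in every cycle. The sketch below checks this on a small example; the matrix (eigenvalues obtained by perturbing the roots of unity, chosen here only to keep the reduction factor moderate), the helper `gmres_cycle`, and the tolerances are assumptions of this illustration.

```python
import numpy as np

def gmres_cycle(A, r, m):
    """Residual at the end of one GMRES(m) cycle started from the residual r."""
    K = np.column_stack([np.linalg.matrix_power(A, j) @ r for j in range(m)])
    Q, _ = np.linalg.qr(K)
    y, *_ = np.linalg.lstsq(A @ Q, r, rcond=None)
    return r - A @ (Q @ y)

rng = np.random.default_rng(5)
n = 6
m = n - 1                                        # restart parameter n - 1
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
theta = 2 * np.pi * (np.arange(n) + 0.3 * (rng.random(n) - 0.5)) / n
A = V @ np.diag(np.exp(1j * theta)) @ V.conj().T # normal, eigenvalues on the unit circle

r = [rng.standard_normal(n) + 1j * rng.standard_normal(n)]
for _ in range(4):
    r.append(gmres_cycle(A, r[-1], m))
norms = [np.linalg.norm(v) for v in r]           # ||r_0||, ..., ||r_4||

rho = norms[1] / norms[0]
predicted = [norms[1] * rho ** k for k in range(1, 4)]   # (3.7) for k = 1, 2, 3
print(norms[2:], predicted)
```

The computed norms $\|r_2\|, \|r_3\|, \|r_4\|$ should match the values predicted from $\|r_0\|$ and $\|r_1\|$ alone.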



Another observation in the proof of Theorem 5 leads to a well known result
due to Baker, Jessup and Manteuffel [1]. In that paper, the authors prove that, when
GMRES(n - 1) is applied to a system with a Hermitian or skew-Hermitian matrix,
the residual vectors at the end of each restart cycle alternate direction in a cyclic
fashion [1, Theorem 2]. In the following corollary we (slightly) refine this result by
providing the exact expression for the constants $\alpha_k$ in [1, Theorem 2].

Corollary 7 (The alternating residuals). Let $r_k$ be a sequence of nonzero residual
vectors produced by GMRES(n - 1) applied to the system (1.1) with a nonsingular
Hermitian or skew-Hermitian matrix $A \in \mathbb{C}^{n \times n}$. Then

(3.9) $r_{k+1} = \alpha_k r_{k-1}, \quad \alpha_k = \frac{\|r_{k+1}\|^2}{\|r_k\|^2} \in (0, 1], \quad k = 1, 2, \ldots, q - 1.$



Proof. For the case of a Hermitian matrix $A$, i.e. $A^H = A$, the proof follows
directly from (3.8) and (2.3).

Let $A$ be skew-Hermitian, i.e. $A^H = -A$. Then, by (3.8) and (2.3),

$r_{k-1} = \|r_k\|^2 \left(K^H(A^H, r_k)\right)^{-1} e_1 = \|r_k\|^2 \left(K^H(-A, r_k)\right)^{-1} e_1 = \frac{\|r_k\|^2}{\|\hat{r}_{k+1}\|^2}\, \hat{r}_{k+1},$

where $\hat{r}_{k+1}$ is the residual vector produced at the end of the cycle GMRES($-A$, $n - 1$,
$r_k$).

According to (1.3), the residual vectors $r_{k+1}$ and $\hat{r}_{k+1}$ at the end of the cycles
GMRES($A$, $n - 1$, $r_k$) and GMRES($-A$, $n - 1$, $r_k$) are obtained by orthogonalizing
$r_k$ against the Krylov residual subspaces $A\mathcal{K}_{n-1}(A, r_k)$ and $(-A)\mathcal{K}_{n-1}(-A, r_k)$
respectively. But $(-A)\mathcal{K}_{n-1}(-A, r_k) = A\mathcal{K}_{n-1}(A, r_k)$, hence $\hat{r}_{k+1} = r_{k+1}$. □
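
The alternation (3.9) can also be checked numerically. In the sketch below, the Hermitian test matrix `H`, the skew-Hermitian matrix `1j * H` derived from it, and the helper names `gmres_cycle` and `alternation_error` are assumptions made purely for illustration.

```python
import numpy as np

def gmres_cycle(A, r, m):
    """Residual at the end of one GMRES(m) cycle started from the residual r."""
    K = np.column_stack([np.linalg.matrix_power(A, j) @ r for j in range(m)])
    Q, _ = np.linalg.qr(K)
    y, *_ = np.linalg.lstsq(A @ Q, r, rcond=None)
    return r - A @ (Q @ y)

def alternation_error(A, r0, m, cycles=5):
    """Largest relative deviation from r_{k+1} = alpha_k r_{k-1} over the run."""
    r = [r0]
    for _ in range(cycles):
        r.append(gmres_cycle(A, r[-1], m))
    errs = []
    for k in range(1, cycles):
        alpha = (np.linalg.norm(r[k + 1]) / np.linalg.norm(r[k])) ** 2
        errs.append(np.linalg.norm(r[k + 1] - alpha * r[k - 1]) / np.linalg.norm(r[k + 1]))
    return max(errs)

rng = np.random.default_rng(6)
n = 6
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]) + 0.1 * rng.standard_normal(n)
H = V @ np.diag(lam) @ V.conj().T                # Hermitian: real eigenvalues
r0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)

err_herm = alternation_error(H, r0, n - 1)       # Hermitian case
err_skew = alternation_error(1j * H, r0, n - 1)  # skew-Hermitian case
print(err_herm, err_skew)
```

Both printed errors should be at rounding level, reflecting that every second residual points in the same direction.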




Fig. 2. Cycle-convergence of GMRES(5) applied to a 100-by-100 diagonalizable (nonnormal) 
matrix. 



4. Note on the departure from normality. In general, for systems with
nonnormal matrices, the cycle-convergence behavior of the restarted GMRES is not
sublinear. Figure 2 illustrates this claim for a nonnormal diagonalizable matrix.
Indeed, for nonnormal matrices, it has been
observed that the cycle-convergence of restarted GMRES can be superlinear [IS].

In this concluding section we restrict our attention to the case of a diagonalizable
matrix $A$,

(4.1) $A = V \Lambda V^{-1}, \quad A^H = V^{-H} \bar{\Lambda} V^H.$

The analysis performed in Theorem 5 can be generalized to the case of a diagonalizable
matrix [16], resulting in the inequality (3.6). However, as we depart from
normality, Lemma 4 fails to hold and the norm of the residual vector $\hat{r}_{k+1}$ at the
end of the cycle GMRES($A^H$, $m$, $r_k$) is no longer equal to the norm of the vector
$r_{k+1}$ at the end of GMRES($A$, $m$, $r_k$). Moreover, since the eigenvectors of $A$ can be
significantly changed by the Hermitian conjugation, as (4.1) suggests, the matrices $A$
and $A^H$ can have almost nothing in common, so that the norms of $\hat{r}_{k+1}$ and $r_{k+1}$ are,
possibly, far from being equal. This gives a chance for breaking the sublinear convergence
of GMRES(m), provided that the subspace $A\mathcal{K}_m(A, r_k)$ results in a better
approximation (1.3) of the vector $r_k$ than the subspace $A^H\mathcal{K}_m(A^H, r_k)$.



It is natural to expect that the convergence of the restarted GMRES for "almost
normal" matrices will be "almost sublinear". We quantify this statement in the
following lemma.

Lemma 8. Let $r_k$ be a sequence of nonzero residual vectors produced by GMRES(m)
applied to the system (1.1) with a nonsingular diagonalizable (4.1) matrix $A \in \mathbb{C}^{n \times n}$,
$1 \le m \le n - 1$. Then

(4.2) $\frac{\|r_k\|}{\|r_{k-1}\|} \le \frac{\alpha \left(\|r_{k+1}\| + \beta_k\right)}{\|r_k\|}, \quad k = 1, \ldots, q - 1,$

where $\alpha = \frac{1}{\sigma_{\min}^2(V)}$, $\beta_k = \|p_k(A)(I - VV^H) r_k\|$, $p_k(z)$ is the polynomial constructed at
the cycle GMRES($A$, $m$, $r_k$), and where $q$ is the total number of GMRES(m) cycles.
Note that as $V^H V \to I$, $0 < \alpha \to 1$ and $0 < \beta_k \to 0$.
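Proof. Consider the norm of the residual vector $\hat{r}_{k+1}$ at the end of the cycle
GMRES($A^H$, $m$, $r_k$),

$\|\hat{r}_{k+1}\| = \min_{p \in \mathcal{P}_m} \|p(A^H) r_k\| \le \|p(A^H) r_k\|,$

where $p(z) \in \mathcal{P}_m$ is any polynomial of degree at most $m$, such that $p(0) = 1$. Then,
using (4.1),

$\|\hat{r}_{k+1}\| \le \|p(A^H) r_k\|$
$\quad = \|V^{-H} p(\bar{\Lambda}) V^H r_k\|$
$\quad = \|V^{-H} p(\bar{\Lambda}) (V^{-1} V) V^H r_k\|$
$\quad = \|V^{-H} p(\bar{\Lambda}) V^{-1} (V V^H) r_k\|$
$\quad = \|V^{-H} p(\bar{\Lambda}) V^{-1} (I - (I - VV^H)) r_k\|$
$\quad = \|V^{-H} p(\bar{\Lambda}) \left(V^{-1} r_k - V^{-1}(I - VV^H) r_k\right)\|$
$\quad \le \|V^{-H}\| \, \|p(\bar{\Lambda}) \left(V^{-1} r_k - V^{-1}(I - VV^H) r_k\right)\|.$

Note that

$\|p(\bar{\Lambda}) \left(V^{-1} r_k - V^{-1}(I - VV^H) r_k\right)\| = \|\bar{p}(\Lambda) \left(V^{-1} r_k - V^{-1}(I - VV^H) r_k\right)\|,$

since the diagonal entries of $p(\bar{\Lambda})$ and $\bar{p}(\Lambda)$ are complex conjugates of each other.
Thus,

$\|\hat{r}_{k+1}\| \le \|V^{-H}\| \, \|\bar{p}(\Lambda) \left(V^{-1} r_k - V^{-1}(I - VV^H) r_k\right)\|$
$\quad = \|V^{-H}\| \, \|(V^{-1} V)\, \bar{p}(\Lambda) \left(V^{-1} r_k - V^{-1}(I - VV^H) r_k\right)\|$
$\quad \le \|V^{-H}\| \, \|V^{-1}\| \, \|V \bar{p}(\Lambda) V^{-1} r_k - V \bar{p}(\Lambda) V^{-1} (I - VV^H) r_k\|$
$\quad = \frac{1}{\sigma_{\min}^2(V)} \|\bar{p}(V \Lambda V^{-1}) r_k - \bar{p}(V \Lambda V^{-1})(I - VV^H) r_k\|$
$\quad \le \frac{1}{\sigma_{\min}^2(V)} \left(\|\bar{p}(A) r_k\| + \|\bar{p}(A)(I - VV^H) r_k\|\right),$

where $\sigma_{\min}(V)$ is the smallest singular value of $V$.

Since the last inequality holds for any polynomial $p(z) \in \mathcal{P}_m$, it will also hold for
$\bar{p}(z) = p_k(z)$, where $p_k(z)$ is the polynomial constructed at the cycle GMRES($A$, $m$,
$r_k$). Hence,

$\|\hat{r}_{k+1}\| \le \frac{1}{\sigma_{\min}^2(V)} \left(\|r_{k+1}\| + \|p_k(A)(I - VV^H) r_k\|\right).$

Setting $\alpha = \frac{1}{\sigma_{\min}^2(V)}$, $\beta_k = \|p_k(A)(I - VV^H) r_k\|$ and observing that $\alpha \to 1$, $\beta_k \to 0$
as $V^H V \to I$, from (3.6) we obtain (4.2).

□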
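
The bound (4.2) can be probed numerically on a mildly nonnormal example. The sketch below is an illustration under assumptions of our own: the eigenvector matrix is a small random perturbation of the identity, the GMRES polynomial is represented by its monomial coefficients, and the helper names `gmres_cycle` and `apply_poly` are hypothetical.

```python
import numpy as np

def gmres_cycle(A, r, m):
    """Return r_next = p(A) r and the coefficients y of p(z) = 1 - sum_j y[j] z^(j+1)."""
    K = np.column_stack([np.linalg.matrix_power(A, j) @ r for j in range(m)])
    AK = A @ K
    y, *_ = np.linalg.lstsq(AK, r, rcond=None)
    return r - AK @ y, y

def apply_poly(A, y, w):
    """Apply p(A) = I - sum_j y[j] A^(j+1) to the vector w."""
    out, v = w.copy(), w.copy()
    for c in y:
        v = A @ v
        out = out - c * v
    return out

rng = np.random.default_rng(7)
n, m, q = 10, 3, 6
# Eigenvector matrix close to unitary (a perturbed identity): A is mildly nonnormal.
V = np.eye(n) + 0.02 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = (0.5 + rng.random(n)) * np.exp(2j * np.pi * rng.random(n))
A = V @ np.diag(lam) @ np.linalg.inv(V)

alpha = 1.0 / np.linalg.svd(V, compute_uv=False).min() ** 2
P = np.eye(n) - V @ V.conj().T                   # I - V V^H

r, polys = [rng.standard_normal(n) + 1j * rng.standard_normal(n)], []
for _ in range(q):
    r_next, y = gmres_cycle(A, r[-1], m)
    polys.append(y)                              # polys[k] is built from r[k]
    r.append(r_next)

checks = []
for k in range(1, q):
    beta_k = np.linalg.norm(apply_poly(A, polys[k], P @ r[k]))
    lhs = np.linalg.norm(r[k]) / np.linalg.norm(r[k - 1])
    rhs = alpha * (np.linalg.norm(r[k + 1]) + beta_k) / np.linalg.norm(r[k])
    checks.append((lhs, rhs))
print(checks)
```

Each printed pair should satisfy `lhs <= rhs`; shrinking the perturbation of `V` drives `alpha` towards one and `beta_k` towards zero, recovering the sublinear bound (3.3).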